Dimension reduction of high-dimensional dataset with missing values
Similar articles
Sufficient Dimension Reduction With Missing Predictors
In high-dimensional data analysis, sufficient dimension reduction (SDR) methods are effective in reducing the predictor dimension, while retaining full regression information and imposing no parametric models. However, it is common in high-dimensional data that a subset of predictors may have missing observations. Existing SDR methods resort to the complete-case analysis by removing all the sub...
Dimension reduction for high-dimensional data.
With the advance of modern technologies, high-dimensional data have become prevalent in computational biology. The number of variables p is very large, and in many applications p is larger than the number of observational units n. Such high dimensionality and the unconventional small-n-large-p setting pose new challenges to statistical analysis methods. Dimension reduction, which aims to reduce t...
Dimensionality reduction with missing values imputation
In this study, we propose a new statistical approach for reducing the high dimensionality of heterogeneous data that limits the curse of dimensionality and deals with missing values. To handle the latter, we propose using the Random Forest imputation method. The main purpose here is to extract useful information and thereby reduce the search space to facilitate the data exploration process. Several ...
Applying data mining algorithms to inpatient dataset with missing values
Purpose – Data preparation plays an important role in data mining, as most real-life data sets contain missing data. This paper aims to investigate different treatment methods for missing data. Design/methodology/approach – This paper introduces, analyses and compares well-established treatment methods for missing data and proposes new methods based on the naïve Bayesian classifier. These methods ...
Indexing Multi-Dimensional Data with Missing Values
Advanced analytical studies are usually conducted on data with many dimensions. However, the large number of attributes associated with each data object naturally leads to situations where not all values are available. This paper presents a novel solution to the problem of retrieving multi-dimensional data with missing values based on region queries. The key aspect of the solution is that it e ...
Journal
Journal title: Journal of Algorithms & Computational Technology
Year: 2019
ISSN: 1748-3026
DOI: 10.1177/1748302619867440